Reconstructing Terrains: Scattered 2D Interpolation¶
In this exercise, we will do interpolation of height values associated with scattered data points. Scattered simply means that the data points are not organized in a grid which would make interpolation somewhat simpler. Instead, we will use the radial basis functions method.
from pygel3d import hmesh, jupyter_display as jd, spatial
import numpy as np
from math import log, exp
from scipy.linalg import solve
import plotly.offline as py
import plotly.graph_objs as go
jd.set_export_mode(True)
array = np.array
Create a grid¶
This function maps positions in a 2D grid into 3D by sampling a height map.
make_grid takes three arguments: dom_x and dom_y are tuples containing the start end points in the x and y domain of the height map respectively. hfun is the height map represented as a function that returns z coordinate when passed x and y. make_grid returns XV, YV, and ZV as a tuple. All three are 2D arrays containing the X, Y, and Z coordinates of the gridded height map respectively.
def make_grid(dom_x,dom_y, hfun):
X = np.linspace(*dom_x,int(10.0*dom_x[1]))
Y = np.linspace(*dom_y,int(10.0*dom_y[1]))
XV,YV = np.meshgrid(X,Y)
ZV = np.zeros(XV.shape)
for i,j in np.ndindex(XV.shape):
ZV[i,j] = hfun(XV[i,j], YV[i,j])
return XV, YV, ZV
Convert a 2D grid to a quad mesh¶
This function accepts a grid represented as three 2D arrays containing X, Y, and Z positions of each grid point. For each quadrilateral in the grid, we now create a face that is added to a new Manifold. Finally, the faces are stitched and the mesh is cleaned (removing the redundant vertices) and the mesh is finally returned.
def mesh_grid(XV,YV,ZV):
m = hmesh.Manifold()
def g2c(i,j):
return [XV[i,j],YV[i,j],ZV[i,j]]
for i,j in np.ndindex(tuple(n-1 for n in XV.shape)):
m.add_face([g2c(i,j), g2c(i,j+1), g2c(i+1,j+1), g2c(i+1,j)])
hmesh.stitch(m)
m.cleanup()
return m
Filtering points¶
The first function, create_2d_tree, puts the points into a k-D tree. This data structure makes it possible to find points that are close to another point efficiently.
filter_points function takes a bunch of points. For each point that has not been visited, it marks as visited all points within a small radius of that point. In this way, we avoid duplicate points which would otherwise cause numerical issues.
def create_2d_tree(pts):
tree = spatial.I3DTree()
for i in range(0,len(pts)):
p = array(pts[i])
p[2] = 0
tree.insert(p,i)
tree.build()
return tree
def filter_points(coords, rad):
tree = create_2d_tree(coords)
new_coords = []
visited = [False]*len(coords)
for i in range(0,len(coords)):
if not visited[i]:
p = array(coords[i])
new_coords += [array(p)]
p[2] = 0
(K,V) = tree.in_sphere(p,rad)
for idx in V:
visited[idx] = True
return new_coords
Draw a grid using Plotly¶
Plotly has a specific function for height maps (Surface) which makes it easy to visualize contours. We use this one in addition to our mesh visualization technique.
def draw_hm(XV,YV,ZV):
contour = go.Surface(x=XV,y=YV,z=ZV,contours=dict(z=dict(show=True)))
lyt = go.Layout(width=850,height=600)
lyt.scene.aspectmode="data"
lyt.scene.zaxis.nticks = 10
fig = go.Figure(data=[contour],layout=lyt)
py.iplot(fig)
Load and filter points¶
The code below loads a point set and transforms it to a domain that fits inside the unit cube. This makes numerics more robust. The coordinates could be very big which might lead to floating point errors.
f = open("./kote1-sorted.txt")
lines = f.readlines()
coords = []
for l in lines:
coords += [list(map(float, l.split()))]
point_mat = np.array(coords)
spanx = point_mat[:,0].max() - point_mat[:,0].min()
spany = point_mat[:,1].max() - point_mat[:,1].min()
span = max(spanx,spany)
coords -= np.array([point_mat[:,0].min(),point_mat[:,1].min(),point_mat[:,2].min()])
coords *= array([1.,1.,5.])/span
coords = filter_points(coords, 0.001)
--------------------------------------------------------------------------- NameError Traceback (most recent call last) Cell In[1], line 6 4 for l in lines: 5 coords += [list(map(float, l.split()))] ----> 6 point_mat = np.array(coords) 7 spanx = point_mat[:,0].max() - point_mat[:,0].min() 8 spany = point_mat[:,1].max() - point_mat[:,1].min() NameError: name 'np' is not defined
Part 1: Gaussian RBF¶
Reconstruct the terrain using gaussian radial basis functions. This involves constructing the A matrix below and implementing the function that sums the radial basis functions (evaluated at the given point) weighted with the coefficient you compute.
# We provide the Gaussian RBF below. Since we will need to evaluate rbf_gaussian
# with a value r=length(v) for a vector, v, and we really use the square of r, r^2,
# we opt instead to call the function with v directly and compute r^2 as the dot
# product of v with itself, i.e. r^2 = <v, v>
def rbf_gaussian(v):
inv_scale = 1000
return exp(-v@v*inv_scale)
N = len(coords)
A = np.zeros((N,N))
b = np.zeros(N)
### ---- Insert code here ----
# Below, fill the matrix A such that entry i,j is the value of the Gaussian RBF
# centered at i and evaluated at j. Also, construct the right hand side, i.e. the
# b vector.
for i in range(N):
for j in range(N):
A[i, j] = rbf_gaussian(coords[i][:2] - coords[j][:2])
b[i] = coords[i][2]
# The regularization below where a small constant time the identity
# is added is essential to making the Gaussian radial basis functions work.
A += 0.001 * np.eye(N)
# Finally, solve for the coefficients
### --------
lambda_ = np.linalg.solve(A, b)
def rbf_height_fun(x, y):
### ---- Insert code here ----
# Below, sum the radial basis functions evaluated at x,y and weighted with the
# coefficients just computed.
radials= [rbf_gaussian([x, y] - p_coords[:2]) for p_coords in coords]
h = np.dot(lambda_, radials)
return h
### --------
XV,YV,ZV= make_grid((0,spanx/span),(0,spany/span), rbf_height_fun)
draw_hm(XV,YV,ZV)
Part 1.2: Variations¶
Explore what effect inv_scale in rbf_gaussian has on the reconstruction. Try to set the value to something that is 10 times greater or smaller and watch what happens.
The effect of the inv_scale parameter in the Gaussian RBF function controls the spread of each basis function. A smaller inv_scale value (indicating a wider spread) would result in a smoother terrain surface, with each data point influencing a larger area. Conversely, a larger inv_scale value (indicating a narrower spread) makes each data point's influence more localized, which could capture more detail or noise in the terrain data.
Part 2: Create an RBF height map:¶
Find the coefficients of a Thin Plate Spline radial basis function interpolation of the terrain points. This is very similar to what you just did, but using the TPS RBF instead of the Gaussian.
def rbf(v):
r = np.sqrt(np.dot(v, v))
return r**2 * np.log(r) if r > 0 else 0.0
N = len(coords)
A = np.zeros((N+3, N+3))
b = np.zeros(N+3)
### ---- Solution begin ----
# Construct the A matrix and the b vector
# Solve for the rbf coefficients
N = len(coords)
A = np.zeros((N+3, N+3))
b = np.zeros(N+3)
for i in range(N):
for j in range(N):
A[i, j] = rbf(coords[i][:2] - coords[j][:2])
b[i] = coords[i][2]
for i in range(N):
A[i, N:N+3] = [1, coords[i][0], coords[i][1]]
A[N:N+3, i] = [1, coords[i][0], coords[i][1]]
coefficients = np.linalg.solve(A, b)
def rbf_height_fun(x, y):
h = sum(coefficients[i] * rbf(np.array([x, y]) - coords[i][:2]) for i in range(N))
# Add the polynomial part
h += coefficients[-3] + coefficients[-2] * x + coefficients[-1] * y
return h
XV,YV,ZV= make_grid((0,spanx/span),(0,spany/span), rbf_height_fun)
draw_hm(XV,YV,ZV)
Questions:¶
- Write your observations regarding the difference between Gaussian and Thin Plate Spline radial basis functions.
- Is it possible to represent a constant height function using Gaussian radial basis functions?
- What is the effect of changing the
inv_scalefactor? In particular, what happens wheninv_scaleis set to something very large? - Why do we add the polynomial terms in the case of Thin Plate Spline RBFs?
- What would you say are the main trade-offs, i.e. pros and cons regarding the RBF method in general? (Hint: you can think beyond 2D)
Write your observations regarding the difference between Gaussian and Thin Plate Spline radial basis functions.¶
The Gaussian RBF, known for its bell-shaped curve, excels in creating smooth, locally influenced surfaces due to its formula ψ(r)=exp(−αr^2) that adjusts based on distance and a shape parameter. It's ideal for tasks needing refined, globally smooth surfaces across multiple dimensions. Thin Plate Spline RBF, on the other hand, mimics the bending of a thin plate, offering a more flexible approach to interpolating unevenly spaced data, especially in two dimensions. It balances fitting data with minimizing surface curvature, making it less prone to overfitting and excellent for 2D and 2.5D surface reconstructions like terrain
Is it possible to represent a constant height function using Gaussian radial basis functions?¶
To represent a constant height function, it's necessary to set α to a very large value so that the exponential term in the Gaussian function approaches zero quickly as r increases. In other words, the Gaussian function has a very narrow peak, effectively behaving like a constant function over a small region around each data point. In practice, when using RBFs for surface reconstruction, Gaussian RBFs are typically used to capture smooth variations in the surface rather than constant features. Other types of RBFs, such as thin plate spline RBFs, may be more suitable for interpolating constant height functions or approximating sharp features in the surface.
What is the effect of changing the inv_scale factor? In particular, what happens when inv_scale is set to something very large?¶
'inv_scale' is the reciprocal of α in ψ(r)=exp(−αr^2). So, when inv_scale is set to a very large value, it means α is very small.
The effect is primarily on the width of the Gaussian function's peak. When inv_scale is small, the Gaussian function becomes very narrow, and its peak becomes taller. It means that the influence of each data point on the interpolation decreases rapidly with distance from the point. Setting inv_scale to a very large value in Gaussian RBFs tends to create sharp, potential overfitting and loss of smoothness in the interpolated surface.
Why do we add the polynomial terms in the case of Thin Plate Spline RBFs?¶
to ensure the uniqueness of the solution to the interpolation problem. For certain types of RBFs, including TPS, the addition of a low-degree linear polynomial ensures that the interpolation system of equations has a unique solution. Then the polynomial terms allow the RBF model to exactly reproduce polynomial surfaces up to a certain degree. Without these terms, the model might not be able to exactly interpolate or approximate polynomial functions. Plus, it increases the flexibility of the RBF model, enabling it to represent a wider variety of shapes and surfaces.
What would you say are the main trade-offs, i.e. pros and cons regarding the RBF method in general? (Hint: you can think beyond 2D)¶
Pros: It is highly flexible and can interpolate or approximate functions with complex, irregular shapes over arbitrary dimensions. Additionally. it can provide very accurate interpolations, especially when the data points are dense and evenly distributed. Then, RBF interpolation typically results in smooth and continuous surfaces.
Cons: The computational cost of solving the RBF interpolation problem can be high, especially for large datasets. The choice of RBF parameters (e.g., the shape parameter) significantly affects the quality of the interpolation. Finding the optimal parameter values can be challenging. RBF interpolation is inherently global; the value of the interpolating function at any given point is influenced by all data points in the dataset. This global nature can lead to unwanted artifacts if the dataset contains outliers or noise